One of the main limitations of the commonly used Absolute Trajectory Error (ATE) is that it is highly sensitive to outliers. As a result, in the presence of just a few outliers, it often fails to reflect changes in accuracy as the inlier trajectory error or the number of outliers varies. In this work, we propose an alternative error metric for evaluating the accuracy of the reconstructed camera trajectory. Our metric, named Discernible Trajectory Error (DTE), is computed in four steps: (1) Shift the ground-truth and estimated trajectories such that both of their geometric medians are located at the origin. (2) Rotate the estimated trajectory such that it minimizes the sum of geodesic distances between the corresponding camera orientations. (3) Scale the estimated trajectory such that the median distance of the cameras to their geometric median is the same as that of the ground truth. (4) Compute the distances between the corresponding cameras, and obtain the DTE by taking the average of the mean and root-mean-square (RMS) distance. This metric is an attractive alternative to the ATE, in that it is capable of discerning changes in trajectory accuracy as the inlier trajectory error or the number of outliers varies. Using a similar idea, we also propose a novel rotation error metric, named Discernible Rotation Error (DRE), which has similar advantages to the DTE. Furthermore, we propose a simple yet effective method for calibrating the camera-to-marker rotation, which is needed for the computation of our metrics. Our methods are verified through extensive simulations.
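The four steps listed in the abstract can be illustrated with a rough sketch. The code below is not the authors' implementation: it covers only steps (1), (3), and (4), and assumes the rotation alignment of step (2) has already been applied to the estimated positions. The function names `geometric_median` and `dte_without_rotation_step` are hypothetical, and the Weiszfeld iteration is one common way to approximate a geometric median, chosen here as an assumption.

```python
import numpy as np

def geometric_median(points, iters=200, eps=1e-9):
    """Approximate the geometric median of an (N, 3) array (Weiszfeld iteration)."""
    median = points.mean(axis=0)
    for _ in range(iters):
        dist = np.linalg.norm(points - median, axis=1)
        weights = 1.0 / np.maximum(dist, eps)  # guard against division by zero
        new_median = (weights[:, None] * points).sum(axis=0) / weights.sum()
        if np.linalg.norm(new_median - median) < eps:
            break
        median = new_median
    return median

def dte_without_rotation_step(gt, est):
    """Steps (1), (3), (4) of the DTE; step (2) is assumed to be done beforehand."""
    # Step 1: shift both trajectories so their geometric medians sit at the origin.
    gt_c = gt - geometric_median(gt)
    est_c = est - geometric_median(est)
    # Step 3: match the median distance of the cameras to the geometric median.
    scale = np.median(np.linalg.norm(gt_c, axis=1)) / np.median(np.linalg.norm(est_c, axis=1))
    est_c = scale * est_c
    # Step 4: average of the mean and RMS distance between corresponding cameras.
    d = np.linalg.norm(gt_c - est_c, axis=1)
    return 0.5 * (d.mean() + np.sqrt((d ** 2).mean()))
```

Under these assumptions, an estimate that differs from the ground truth only by a translation and a positive scale should yield an error close to zero, since both the geometric median and the median distance respond to such a transform in a predictable way.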
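We propose a novel hierarchical approach for multiple rotation averaging, called HARA. Our method incrementally initializes the rotation graph based on a hierarchy of triplet support. The key idea is to build a spanning tree by prioritizing the edges with many strong triplet supports and gradually adding those with weaker and fewer supports. This reduces the risk of adding outliers to the spanning tree. As a result, we obtain a robust initial solution that enables us to filter outliers prior to nonlinear optimization. With a small modification, our approach can also integrate knowledge of the number of valid 2D-2D correspondences. We perform extensive evaluations on both synthetic and real datasets, demonstrating state-of-the-art results.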
A network trained for domain adaptation is prone to bias toward the easy-to-transfer classes. Since ground truth labels on the target domain are unavailable during training, this bias leads to skewed predictions that neglect hard-to-transfer classes. To address this problem, we propose Cross-domain Moving Object Mixing (CMOM), which cuts several objects, including hard-to-transfer classes, from the source domain video clip and pastes them into the target domain video clip. Unlike image-level domain adaptation, the temporal context should be maintained to mix moving objects in two different videos. Therefore, we design CMOM to mix with consecutive video frames, so that unrealistic movements do not occur. We additionally propose Feature Alignment with Temporal Context (FATC) to enhance target domain feature discriminability. FATC exploits the robust source domain features, which are trained with ground truth labels, to learn discriminative target domain features in an unsupervised manner by filtering unreliable predictions with temporal consensus. We demonstrate the effectiveness of the proposed approaches through extensive experiments. In particular, our model reaches an mIoU of 53.81% on the VIPER to Cityscapes-Seq benchmark and an mIoU of 56.31% on the SYNTHIA-Seq to Cityscapes-Seq benchmark, surpassing the state-of-the-art methods by large margins.
Neural networks trained with ERM (empirical risk minimization) sometimes learn unintended decision rules, in particular when their training data is biased, i.e., when training labels are strongly correlated with undesirable features. To prevent a network from learning such features, recent methods augment training data such that examples displaying spurious correlations (i.e., bias-aligned examples) become a minority, whereas the other, bias-conflicting examples become prevalent. However, these approaches are sometimes difficult to train and scale to real-world data because they rely on generative models or disentangled representations. We propose an alternative based on mixup, a popular augmentation that creates convex combinations of training examples. Our method, coined SelecMix, applies mixup to contradicting pairs of examples, defined as showing either (i) the same label but dissimilar biased features, or (ii) different labels but similar biased features. Identifying such pairs requires comparing examples with respect to unknown biased features. For this, we utilize an auxiliary contrastive model with the popular heuristic that biased features are learned preferentially during training. Experiments on standard benchmarks demonstrate the effectiveness of the method, in particular when label noise complicates the identification of bias-conflicting examples.
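As a brief illustration of the convex combination that mixup performs, the sketch below mixes one pair of examples with a fixed coefficient. It shows standard mixup only, with one-hot labels mixed by the same coefficient as the inputs. How SelecMix selects contradicting pairs and assigns labels is not detailed in the abstract and is not reproduced here; the function name `mixup_pair` and the parameter `lam` are hypothetical.

```python
import numpy as np

def mixup_pair(x_a, y_a, x_b, y_b, lam):
    """Convex combination of two examples; lam in [0, 1] weights the first one."""
    x = lam * x_a + (1.0 - lam) * x_b  # mixed input features
    y = lam * y_a + (1.0 - lam) * y_b  # mixed (soft) one-hot label
    return x, y

# Usage: mix two toy examples with one-hot labels, weighting the first at 0.75.
x_mix, y_mix = mixup_pair(
    np.array([1.0, 0.0]), np.array([1.0, 0.0]),
    np.array([0.0, 1.0]), np.array([0.0, 1.0]),
    lam=0.75,
)
```

In common practice the coefficient is sampled from a Beta distribution for each pair; a fixed value is used above only to keep the example deterministic.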
Recently, numerous studies have investigated cooperative traffic systems using vehicle-to-everything (V2X) communication. Unfortunately, when multiple autonomous vehicles are deployed while exposed to communication failure, their ideal conditions may conflict, leading to adversarial situations on the road. In South Korea, virtual and real-world urban autonomous multi-vehicle races were held in March and November of 2021, respectively. During the competition, multiple vehicles were involved simultaneously, which required maneuvers such as overtaking low-speed vehicles, negotiating intersections, and obeying traffic laws. In this study, we introduce a fully autonomous driving software stack to deploy a competitive driving model, which enabled us to win the urban autonomous multi-vehicle races. We evaluate module-based systems such as navigation, perception, and planning in real and virtual environments. Additionally, an analysis of traffic is performed after collecting multiple vehicle position data over communication to gain additional insight into a multi-agent autonomous driving scenario. Finally, we propose a method for analyzing traffic in order to compare the spatial distribution of multiple autonomous vehicles. We study the similarity distribution between each team's driving log data to determine the impact of competitive autonomous driving on the traffic environment.
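Recent studies have made great progress in video matting by extending the success of trimap-based image matting to the video domain. In this paper, we push this task toward a more practical setting and propose the One-Trimap Video Matting network (OTVM), which performs video matting robustly using only one user-annotated trimap. A key element of OTVM is the joint modeling of trimap propagation and alpha prediction. Starting from baseline trimap propagation and alpha prediction networks, our OTVM combines the two networks with an alpha-trimap refinement module to facilitate information flow. We also present an end-to-end training strategy to take full advantage of the joint model. Compared with previous decoupled methods, our joint modeling greatly improves the temporal stability of trimap propagation. We evaluate our model on two recent video matting benchmarks, Deep Video Matting and VideoMatting108, and outperform the state of the art by significant margins (MSE improvements of 56.4% and 56.7%, respectively). The source code and model are available online: https://github.com/hongje/otvm.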
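This paper presents resilient navigation and planning algorithms for high-speed autonomous racing in the Indy Autonomous Challenge (IAC). The IAC is a competition of full-scale autonomous race cars that can drive at up to 290 km/h (180 mph). Due to the high-speed vibration of the race car, the GPS/INS system is prone to degradation. These degraded GPS measurements can cause severe localization errors, leading to serious crashes. To this end, we propose a robust navigation system that implements a multi-sensor fusion Kalman filter. In this study, we present how to identify the degradation of a measurement based on a probabilistic approach. Based on this approach, we can compute the optimal measurement values for the Kalman filter correction step. At the same time, we present an additional resilient navigation system so that the race car can follow the track in the event of a fatal localization failure. In addition, this paper covers an optimal path planning algorithm for obstacle avoidance. To account for the original optimal racing line, obstacles, and vehicle dynamics, we propose a road-graph-based path planning algorithm that ensures our race car drives within the bounded conditions. In the experiments, we evaluate whether our localization system can handle degraded data and, in some cases, prevent serious crashes while driving at high speed. In addition, we describe how we successfully completed the obstacle avoidance challenge.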
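Unsupervised video object segmentation (UVOS) is a per-pixel binary labeling problem that aims to separate the foreground object from the background in a video without using the ground truth (GT) mask of the foreground object. Most previous UVOS models use the first frame or the entire video as a reference frame to specify the mask of the foreground object. Our question is why the first frame should be selected as the reference frame, or why the entire video should be used to specify the mask. We believe that we can select a better reference frame and achieve better UVOS performance than by using only the first frame or the entire video as the reference. In this paper, we propose the Easy Frame Selector (EFS). EFS enables us to select an "easy" reference frame that makes the subsequent VOS easy, thereby improving VOS performance. Furthermore, we propose a new framework named Iterative Mask Prediction (IMP). In this framework, we repeatedly apply EFS to the given video and select an "easier" reference frame than in the previous iteration, thereby incrementally increasing VOS performance. The framework consists of EFS, Bi-directional Mask Prediction (BMP), and Temporal Information Updating (TIU). With the proposed framework, we achieve state-of-the-art performance on three UVOS benchmark sets: DAVIS16, FBMS, and SegTrack-V2.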
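In this paper, we develop an efficient retrospective deep learning method, called stacked U-Nets with self-assisted priors, to address the problem of rigid motion artifacts in MRI. The proposed work exploits additional knowledge priors from the corrupted images themselves, without the need for additional contrast data. The proposed network learns missed structural details by sharing auxiliary information from contiguous slices of the same distorted subject. We further design a refinement of the stacked U-Nets that helps preserve spatial image details and hence improves the pixel-to-pixel dependency. To perform network training, simulation of MRI motion artifacts is inevitable. We present an intensive analysis using various types of image priors: the proposed self-assisted priors and priors from other image contrasts of the same subject. The experimental analysis demonstrates the effectiveness and feasibility of the self-assisted priors, since they do not require any further data scans.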
Artificial intelligence (AI) and robotic coaches promise to improve the engagement of patients in rehabilitation exercises through social interaction. While previous work explored the potential of automatically monitoring exercises for AI and robotic coaches, the deployment of these systems remains a challenge. Previous work identified the lack of stakeholder involvement in designing such functionalities as one of the major causes. In this paper, we present our efforts to elicit detailed design specifications, together with four therapists and five post-stroke survivors, on how AI and robotic coaches could interact with patients and guide their exercises in an effective and acceptable way. Through iterative questionnaires and interviews, we found that both post-stroke survivors and therapists appreciated the potential benefits of AI and robotic coaches for achieving more systematic management and improving their self-efficacy and motivation in rehabilitation therapy. In addition, our evaluation sheds light on several practical concerns (e.g., possible difficulty with the interaction for people with cognitive impairment, system failures, etc.). We discuss the value of early stakeholder involvement and of interactive techniques that not only compensate for system failures but also support personalized therapy sessions, for the better deployment of AI and robotic exercise coaches.